In recent years, nonlinear model predictive control (NMPC) has been extensively used for solving automotive motion control and planning tasks. In order to formulate the NMPC problem, different coordinate systems can be used with different advantages. We propose and compare formulations for the NMPC related optimization problem, involving a Cartesian and a Frenet coordinate frame (CCF/ FCF) in a single nonlinear program (NLP). We specify costs and collision avoidance constraints in the more advantageous coordinate frame, derive appropriate formulations and compare different obstacle constraints. With this approach, we exploit the simpler formulation of opponent vehicle constraints in the CCF, as well as road aligned costs and constraints related to the FCF. Comparisons to other approaches in a simulation framework highlight the advantages of the proposed approaches.
translated by 谷歌翻译
Flexible robots may overcome the industry's major problems: safe human-robot collaboration and increased load-to-mass ratio. However, oscillations and high dimensional state space complicate the control of flexible robots. This work investigates nonlinear model predictive control (NMPC) of flexible robots -- for simultaneous planning and control -- modeled via the rigid finite element method. Although NMPC performs well in simulation, computational complexity prevents its deployment in practice. We show that imitation learning of NMPC with neural networks as function approximator can massively improve the computation time of the controller at the cost of slight performance loss and, more critically, loss of safety guarantees. We leverage a safety filter formulated as a simpler NMPC to recover safety guarantees. Experiments on a simulated three degrees of freedom flexible robot manipulator demonstrate that the average computational time of the proposed safe approximate NMPC controller is 3.6 ms while of the original NMPC is 11.8 ms. Fast and safe approximate NMPC might facilitate the industry's adoption of flexible robots and new solutions for similar problems, e.g., deformable object manipulation and soft robot control.
translated by 谷歌翻译
We present an approach for safe trajectory planning, where a strategic task related to autonomous racing is learned sample-efficient within a simulation environment. A high-level policy, represented as a neural network, outputs a reward specification that is used within the cost function of a parametric nonlinear model predictive controller (NMPC). By including constraints and vehicle kinematics in the NLP, we are able to guarantee safe and feasible trajectories related to the used model. Compared to classical reinforcement learning (RL), our approach restricts the exploration to safe trajectories, starts with a good prior performance and yields full trajectories that can be passed to a tracking lowest-level controller. We do not address the lowest-level controller in this work and assume perfect tracking of feasible trajectories. We show the superior performance of our algorithm on simulated racing tasks that include high-level decision making. The vehicle learns to efficiently overtake slower vehicles and to avoid getting overtaken by blocking faster vehicles.
translated by 谷歌翻译
Besides the recent impressive results on reinforcement learning (RL), safety is still one of the major research challenges in RL. RL is a machine-learning approach to determine near-optimal policies in Markov decision processes (MDPs). In this paper, we consider the setting where the safety-relevant fragment of the MDP together with a temporal logic safety specification is given and many safety violations can be avoided by planning ahead a short time into the future. We propose an approach for online safety shielding of RL agents. During runtime, the shield analyses the safety of each available action. For any action, the shield computes the maximal probability to not violate the safety specification within the next $k$ steps when executing this action. Based on this probability and a given threshold, the shield decides whether to block an action from the agent. Existing offline shielding approaches compute exhaustively the safety of all state-action combinations ahead of time, resulting in huge computation times and large memory consumption. The intuition behind online shielding is to compute at runtime the set of all states that could be reached in the near future. For each of these states, the safety of all available actions is analysed and used for shielding as soon as one of the considered states is reached. Our approach is well suited for high-level planning problems where the time between decisions can be used for safety computations and it is sustainable for the agent to wait until these computations are finished. For our evaluation, we selected a 2-player version of the classical computer game SNAKE. The game represents a high-level planning problem that requires fast decisions and the multiplayer setting induces a large state space, which is computationally expensive to analyse exhaustively.
translated by 谷歌翻译
The short-term prediction of precipitation is critical in many areas of life. Recently, a large body of work was devoted to forecasting radar reflectivity images. The radar images are available only in areas with ground weather radars. Thus, we aim to predict high-resolution precipitation from lower-resolution satellite radiance images. A neural network called WeatherFusionNet is employed to predict severe rain up to eight hours in advance. WeatherFusionNet is a U-Net architecture that fuses three different ways to process the satellite data; predicting future satellite frames, extracting rain information from the current frames, and using the input sequence directly. Using the presented method, we achieved 1st place in the NeurIPS 2022 Weather4Cast Core challenge. The code and trained parameters are available at \url{https://github.com/Datalab-FIT-CTU/weather4cast-2022}.
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译
图表和表格之类的数据的可视化表示对于读者来说可能是具有挑战性的。先前的工作表明,将可视化与文本相结合可以改善静态环境中见解的交流,但是关于交互式的洞察力知之甚少。在这项工作中,我们提出了一个NLG聊天机器人,该聊天机器人可以处理自然语言查询,并通过图表和文本的组合提供见解。我们将其应用于营养,域通信质量至关重要。通过人群评估,我们将聊天机器人与传统的静态饮食应用程序的信息性进行比较。我们发现,对话环境可显着提高用户对各种任务中饮食数据的理解,并且用户认为聊天机器人比传统应用程序更有用和快速使用。
translated by 谷歌翻译
机器学习与服务(MLAAS)已成为广泛的范式,即使是通过例如,也是客户可用的最复杂的机器学习模型。一个按要求的原则。这使用户避免了数据收集,超参数调整和模型培训的耗时过程。但是,通过让客户访问(预测)模型,MLAAS提供商危害其知识产权,例如敏感培训数据,优化的超参数或学到的模型参数。对手可以仅使用预测标签创建模型的副本,并以(几乎)相同的行为。尽管已经描述了这种攻击的许多变体,但仅提出了零星的防御策略,以解决孤立的威胁。这增加了对模型窃取领域进行彻底系统化的必要性,以全面了解这些攻击是成功的原因,以及如何全面地捍卫它们。我们通过对模型窃取攻击,评估其性能以及探索不同设置中相应的防御技术来解决这一问题。我们为攻击和防御方法提出了分类法,并提供有关如何根据目标和可用资源选择正确的攻击或防御策略的准则。最后,我们分析了当前攻击策略使哪些防御能力降低。
translated by 谷歌翻译
我们提出了一种新颖的方法来通过使用具有不同个性类型的代理来生成脚本。为了管理脚本中的字符交互,我们采用了模拟的戏剧网络。关于多个标准的自动和人类评估表明,我们的方法的表现优于基于香草-GPT2的基线。我们进一步引入了一个新的指标,以根据自然语言推论评估对话一致性并证明其有效性。
translated by 谷歌翻译
高数据质量对于当今基于AI的系统至关重要。但是,尽管数据质量一直是研究的对象,但显然缺乏对潜在数据质量问题的研究(例如,模棱两可的,无关的价值)。这些问题本质上是潜在的,因此通常不明显。然而,它们可能与基于AI的系统(例如技术债务,数据引起的故障)的未来问题的风险增加有关。作为软件工程中代码气味的对应物,我们指的是数据气味的问题。本文概念化了数据的气味,并在基于AI的系统的背景下的原因,后果,检测和使用。此外,出现了36个数据气味的目录,分为三类(即可信度的气味,可理解的气味,一致性的气味)。此外,该文章概述了用于检测数据气味的工具支持,并提出了240多个现实世界数据集中初始气味检测的结果。
translated by 谷歌翻译